Skip to content

Conversation

@guettli
Copy link
Collaborator

@guettli guettli commented Nov 3, 2025

This PR adds support for the upstream hcloud CCM (https://github.com/hetznercloud/hcloud-cloud-controller-manager/).

Until now, CAPH relied on our Syself fork, the "hetzner" CCM (https://github.com/syself/hetzner-cloud-controller-manager/).

The fork was required several years ago because the upstream CCM did not support bare-metal servers. Hetzner has integrated our bare metal support into its CCM (hcloud ccm PR 561), so now is the time to gradually phase out our fork.

The CCM is responsible for setting the ProviderID on Nodes. Our fork used the format hcloud://bm-NNNN for bare-metal nodes, while the upstream CCM uses hrobot://NNNN.

This PR makes both formats work, so you can use either the Syself CCM or the upstream hcloud CCM from hetznercloud.

If you use caph containing this PR, then it is safe to switch to the upstream ccm. Existing nodes will keep their legacy hcloud://bm-NNNN ProviderIDs. Only new nodes will get hrobot://NNNN. The upstream ccm can read the legacy format.

HCloud nodes are not affected by this change.

Changes

  • More docs for HetznerBareMetalMachineSpec.ProviderID
  • Create HetznerBareMetalMachineSpec.ProviderID in the new format: hrobot://NNNN. Before it was hcloud://bm-NNNN.
  • Validate that the Provider does not get changed (ValidationWebHook)
  • More docs for HetznerCluster.Spec.HetznerSecret.Name
  • To ensure compatibilty with both CCMs: If HetznerCluster.Spec.HetznerSecret.Name is not "hcloud", then create two secrets in the wl-cluster. One with name "hcloud", one with the name of HetznerCluster.Spec.HetznerSecret.Name. This ensures compatibility with upstream hcloud-ccm.
  • To ensure compatibilty with both CCMs: hcloud-ccm wants the HCLOUD_TOKEN to be on key "token" in the secret. Syself ccm expects the key to be "hcloud". Create both keys, so that both CCMs are supported.
  • Added an entry note to the secret in the wl-cluster. This helps new users to understand that this secret gets reconciled by caph.
  • Changed defaults in HetznerSecretKeyRef from "hcloud-token", "hetzner-robot-user", "hetzner-robot-password" to "token", "robot-user", "robot-password". This aligns to the upstream ccm values.
  • update CSR-Controller to support both: the legacy ProviderID (hcloud://bm-NNNN) and the new ProviderID (hrobot://NNNN)
  • renamed internal function reconcileTargetSecret to reconcileWorkloadClusterSecrets and rename some internal variables to make differentiation between mgt-cluster and wl-cluster more obvious.
  • Check that HCloud ProviderID starts with "hcloud"
  • ... TODO

TODO Release as caph v1.1.0

This changes a lot, additionally the last caph release was not done from main. This should be released as v1.1.0

This version get used already used in the docs. Please adapt, if a different version will contain that PR.

@github-actions github-actions bot added size/M Denotes a PR that changes 50-200 lines, ignoring generated files. area/code Changes made in the code directory area/api Changes made in the api directory labels Nov 3, 2025
Copilot AI added a commit that referenced this pull request Nov 3, 2025
@github-actions github-actions bot added size/S Denotes a PR that changes 20-50 lines, ignoring generated files. and removed size/M Denotes a PR that changes 50-200 lines, ignoring generated files. labels Nov 3, 2025
@github-actions github-actions bot added size/M Denotes a PR that changes 50-200 lines, ignoring generated files. and removed size/S Denotes a PR that changes 20-50 lines, ignoring generated files. labels Nov 4, 2025
@guettli guettli requested a review from janiskemper November 4, 2025 16:51
@github-actions github-actions bot added size/L Denotes a PR that changes 200-800 lines, ignoring generated files. area/hack Changes made in the hack directory and removed size/M Denotes a PR that changes 50-200 lines, ignoring generated files. labels Nov 4, 2025

// upstream hcloud-ccm uses the secret key "token", while the old Syself ccm used "hcloud".
// For compatibilty, we create always the other key, too.
for _, key := range []string{"token", "hcloud"} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right now we create a duplicate right? Wouldn't it be enough to create the right key (e.g. "token" for the upstream ccm)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result should work for both CCMs.

By creating both keys we are safe. Users can try to upstream hcloud-ccm. If it fails, they can easily switch back to the old syself-ccm.

Why not create both keys?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you create both keys in both secrets. But you actually need only one key per secret right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it is less confusing for users if we have only the correct key in the secret. That's why I would implement it in that way

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: We deprecate HetznerCluster.Spec.HetznerSecret.Key.HCloudToken. In the docstring we explain that the controller will currently create "token", "hcloud" and the specified value.

In the long run (maybe caph version released at the end of 2027) we will remove that field and will only create "token".

I think it is perfectly fine if that value can't be configured.

What do you think about that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose so. Can you discuss this with @batistein?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janiskemper I talked to Sven. He is fine with creating three keys in the secret. But he does not want to deprecate this setting.

Copy link
Contributor

@janiskemper janiskemper Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay can you then update the comments in the function to make that clear that these three keys are created for the hcloud token?
EDIT: And this should be also documented I suppose

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janiskemper I added a comment to the function and to the related field in the struct:

623f05a

ok now?

@guettli guettli requested a review from janiskemper November 5, 2025 09:28
@guettli guettli requested a review from batistein November 11, 2025 10:11
type HetznerSecretKeyRef struct {
// HCloudToken defines the name of the key where the token for the Hetzner Cloud API is stored.
// We recommend to use "token", because this is the default of upstream hcloud-ccm.
// hcloudToken defines the name of the key where the token for the Hetzner Cloud API is stored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capital "HCloudToken"

// We recommend to use "token", because this is the default of upstream hcloud-ccm.
// hcloudToken defines the name of the key where the token for the Hetzner Cloud API is stored.
// We recommend to use "token", because this is the default of upstream hcloud-ccm, while the
// legacy Syself ccm uses "hcloud". For maximal compatibility up to three keys get created in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's write "Syself fork" instead of "Syself ccm"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/api Changes made in the api directory area/code Changes made in the code directory area/hack Changes made in the hack directory size/L Denotes a PR that changes 200-800 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants